Jamming Model for the Extremal Optimization Heuristic 



O 

o 

(N 
o 

Q 

(N 



I 

o 
o 



> 



o 



X3 
O 

o 



X 
J3 



Stefan Boettcher^' ''' and Michelangelo Grigni^'-'- 

^Physics Department, Emory University, Atlanta, Georgia 30322, USA 
^ Dept. of Mathematics and Computer Sciences, 
Emory University, Atlanta, Georgia 30322, USA 

(Dated: February 6, 2008) 

Extremal Optimization, a recently introduced meta-heuristic for hard optimization problems, 
is analyzed on a simple model of jamming. The model is motivated first by the problem of 
finding lowest energy configurations for a disordered spin system on a fixed-valence graph. The 
numerical results for the spin system exhibits the same phenomenology found in all earlier 
studies of extremal optimization, and our analytical results for the model reproduce many of 
these features. 
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I. INTRODUCTION 

Many situations in physics and beyond require the 
solution of NP-hard optimization problems, for which 
the typical time needed to ascertain the exact solution 
apparently grows faster than any power of the system 
size Examples in the sciences are the determination 
of ground states for disordered magnets ^ ^, ^, 
optimal arrangements of atoms in a compound 
a polymer js). With the advent of ever faster comput- 
ers, the exact study of such problems has become feasi- 
ble 1^, Yet, with typically exponential complexity 
of these problems, many questions regarding those sys- 
tems still are only accessible via approximate, heuristic 
methods . Heuristics trade off the certainty of an ex- 
act result against finding optimal or near-optimal solu- 
tions with high probability in polynomial time. Many of 
these heuristics have been inspired by physical optimiza- 
tion processes, for instance, simulated annealing or 
genetic algorithms pSf . 

Extremal optimization (EO) was proposed re- 
cently [|^, and has been used to treat a variety of 
combinatorial |p^ , |l6| and physical optimization prob- 
lems l|5||. Comparative studies with simulated anneal- 
ing [|l4 and other Metropolis based heuristics Q 
have established EO as a successful alternative for the 
study of NP-hard problems, especially near phase transi- 
tions jl^ that are associated with the most complex in- 
stances of such problems @ |^ |2l|, |2|, H || . Recently, 
EO has also been successfully applied to Lennard-Jones 
glasses 1^. 

In this paper, we elucidate some properties of the EO 
algorithm with analytical means. We motivate our the- 
oretical model system with a brief study of a disordered 
spin system on a random graph. EO applied to find- 
ing ground states of this system reveal the same generic 
properties found for the algorithm previously. From this 
problem, we can abstract a set of evolution equations 
which allow a complete analysis of EO as a function of its 



single parameter, r, and the system size, n. In particular, 
an optimal value for r as a function of n is determined 
in close analogy with the scaling found numerically in 
all previous studies |2^. We finish with a discussion of 
how this model can be used also to investigate alterna- 
tive versions of EO, or to analytically compare EO with 
simulated annealing and other local search heuristics. 



II. SPIN GLASSES ON FIXED- VALENCE 
RANDOM GRAPHS 

Disordered spin systems on random graphs have been 
investigated as mean-field models of spin glasses 
or optimization problems 28, |2^, p4| . 



since vari- 
ables are long-range connected yet have a small number 
of neighbors. Particularly simple are a-valent random 
graphs (28[ |2^, |l^ . In these graphs each vertex possesses 
a fixed number a of bonds to randomly selected other ver- 
tices. Specifically, we have used the method described in 
Ref. to generate these graphs which are also referred 
to as a-regular graphs. (Note that self loops or double 
connections are not allowed, and disconnected graphs are 
highly unlikely). Just as on a lattice, one can assign a 
spin variable Xi G { — 1,-1-1} to each vertex, and couplings 
Jij G {— 1,-fl} to existing bonds between neighboring 
vertices i and j. The energy of the system then is the 
difference between violated bonds and satisfied bonds. 



(1) 



{bonds} 



It is more convenient to consider a linearly related quan- 
tity, which merely tallies the number of violated bonds 
per spin in a configuration. 



H a 
2n 



>0, 



(2) 



where we have used the fact that each graph has a total 
of an/2 bonds. 
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Clearly, for all Jij = 1 the spin systera has two ferro- 
magnetic ground states with e = that are easy to find 
(all Xi — \ or all ~ — !)• But for anti-ferromagnetic 
bonds = —1, the ground state energy depends on the 
disordered structure of the graph itself. Only if all loops 
in the graph were of even length (like in a hyper-cubic lat- 
tice), there are again simple ground states, each with an 
alternating spin pattern (Neel state). Instead, in a ran- 
dom graph, the disorder creates loops that have an equal 
chance to be odd or even length. Thus, on average, half 
of the loops can have all bonds satisfied, the other half 
will have at least one bond frustrated. Since the length 
of loops in random graphs typically diverges with log(n), 
each odd loop almost certainly has other odd loops as a 
neighbors to share a violated bond with. In fact, even for 
a spin glass, Jij G {—1,-1-1}, the same argument should 
hold, since only half of the loops will be frustrated and 
neighboring frustrated loops can share violated bonds. 
We find that the average ground state energies found for 
either bond distribution are identical for n — > oo, in sup- 
port of the above argument, but the results appear to 
differ in next-to- leading order corrections. 



A r-EO algorithm for a-valent graphs 

To obtain the numerical results in Fig. 0, we used the 
following implementation of r-EO (see also Ref. Q): For 
a given spin configuration on a graph, assign to each spin 
Xi a "fitness" 



'^violated bonds 



-0,-1, 



-a, 



so that 



2n ^ 



(3) 



(4) 



is satisfied. Each spin falls into one of only a -I- 1 possible 
states. Say, currently there are spins with the worst 
fitness, A = —a, nQ,_i with A — —(a — 1), and so on 
up to no spins with the best fitness A = 0. (Note that 
n = Now draw a "rank" fc according to the 

distribution 



P(fc) = 



T - 1 



(1 < fc < n) 



(5) 



Determine < ?' < a such that V]^_,- , , rii < k < 
SiLj Finally, select any one of the rij spins in state 
j and reverse its spin unconditionally. As a result, it 
and its neighboring spins change their fitness. After all 
the effected A's and n's are reevaluated, the next spin is 
chosen for an update. 

This EO implementation updates spins with a (r- 
dcpcndent) bias against poorly adapted spins on behalf 
of Eq. (P). This process is "extremal" in the sense that 
it focuses on atypical variables, and it forms the basis of 



the EO method. The only adjustable parameter in this 
algorithm is the power-law exponent r. For r = 0, ran- 
domly selected spins get forced to update, resulting in 
merely a random walk through the configuration space. 
The search is ergodic but yields poor results. For r ^ oo, 
only spins in the worst state get updated which quickly 
traps the update process to a small region of the con- 
figuration space which may be far from a near-optimal 
solution. Ergodicity is broken in the sense that configu- 
rations far from the initial conditions are unlikely to be 
reached within a given runtime. The dependence of per- 
formance on T for this and all previous implementations 
of r-EO (for quite different optimization problems ||25|, [^) 
exhibits the features shown in Fig. ||: The best average 
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FIG. 1: Extrapolation for the number of violated bonds 
per spin, e, as a function of 1/n (a) for trivalent and (b) 
for 4-valent graphs of size n = 32, 64, . . . , 1024. Circles 
refer to an anti-ferromagnetic, and the squares to a ± J 
bond distribution. The error bars for (e) are smaller than 
the symbols. The data for the spin glass is independent 
of the way the a-valent graph was formed and is best fit 
(continuous line) by 60=3(00) — 0.1155(5) -t- 0.35 ln(n)/n 
and 6^,^4(00) = 0.266(1) + 0.631n(n)/n. We found that 
the data for the anti-ferromagnet for smaller n varies 
strongly with the way the a-valent graph was formed 
(here we used the method described in Ref. Q) and is 
difficult to fit. It is apparent, though, that the differ- 
ence between the data for the spin glass and the anti- 
ferromagnet is decreasing for n — > 00. 
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performance in approximating ground state energies at a 
fixed runtime is obtained for a value of Topt slightly larger 



than 1, and 



opt 



1+ for 



oo. (Note that t — 1 



would be a poor choice! Practical values for, say, an 
n = 1000 random graph typically range from Topt ~ 1-1 
for spin glasses to Topt ~ 1-6 for bipartitioning.) In fact, 
the (more extensive) numerical data presented in Ref. 
suggested a simple argument that yields 



Topt 



1 



V m n / 

Inn 



[n oo, ln(n) <C a ^ n), (6) 



where tmax — an was used as the maximum number of 
updates for a single EO run. This asymptotic behavior 
was justified by placing Topt at the "edge to ergodicity," 
a point between having r large enough to descent into 
local minima while having r just small enough to not get 
trapped inside the basin of any local minimum. In the 
following we present a model to make this notion more 
concrete. 



III. EVOLUTION MODELS 

We can abstract the random glass problem in Sec. || 
into a simple model which demonstrates previous ob- 
servations about T-EO in an analytically tractable way. 
Consider the spin system on an a-valent graph. Each 
spin i can be in one of a + 1 states A^: either no adjacent 
bond is violated and i is among the uq spins, only one 
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FIG. 2: Plot of the number of violated bonds per spin 
(e) as a function of t as obtained by t-EO for a ± J-spin 
glass on trivalent graphs. Shown are the results for e av- 
eraged over 4 runs each on a set of 20 graphs for n = 128, 
256, and 512. While the results clearly get worse rapidly 
for T < 1, even for r <^ 1 a decline in the quality can 
be observed. (The weak dependence of e for large r may 
indicate that a greedy approach to finding ground states 
will yield good approximations ||3l|.) Despite of the slow 
variation with t, the value of Topt where (e) is minimized 
clearly decreases toward r = 1"*" with increasing n, con- 
sistent with Eq. (^). 



bond is violated placing it among the ni spins, and so 
forth up to the Ua spins which have all their adjacent 
bonds violated. Thus, one can define densities for each 
of the a + 1 states, pi = nt/n {i = 0, . . . ,a). In general, 
we can interpret any local search procedure, which only 
updates a single variable at a time, simply as a set of 
evolution equations for the pi(t), to wit 



Pi 



(7) 



Here, Qj is the probability that a spin in state j gets 
updated, and the matrix T^j specifies the net transition 
to state i given that a spin in state j is updated. Note 
that conservation of probability requires 



and conservation of variables requires 

E^^^ = o (0<J<«)- 



(8) 



(9) 



Both, T and Q may generally depend on the pi (t) as well 
as on t explicitly. (For instance, for simulated annealing 
with a temperature schedule, the Qi could depend ex- 
plicitly on t through the changing temperature.) 
Another relation is provided by the constraint 



(10) 



which implies pi = 0. Thus, one of the equations 
in (0) is always redundant. The cost per variable to be 
minimized in Eq. (H) now reads 



(11) 



with e = being optimal. 

The advantage of this notation lies in the fact that the 
average update preference, Q, is separate from the up- 
date process described by T. For instance, for a random 
walk (equivalent to r-EO at t = or simulated annealing 
at high temperature) Qj (t) = pj {t) , since the probability 
that a spin in state i gets chosen for an update is equal 
to the number of those spins, no matter how that update 
is processed by T. What is typically unknown for a hard 
problem is the general form of T. But to understand the 
properties of a heuristic expressed through Q, it may be 
revealing to "design" interesting T. 



A Annealed approximation to the glass problem 

We can construct T for the glass problem in Sec. |^on 
a trivalent graph in an annealed approximation. Since T 
in this case is quite messy, and of no great consequence 
beyond this Section, we focus on one of its components, 
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say, Ti_2- This component represents the net flux in or 
out of pi, given that a variable in state 2 gets updated. 
The annealed approximation consists of the unbiased as- 
sumption that each of the a = 3 neighboring vertices 
can be in state i with probability pi independently. Of 
course, no neighboring vertex can be in state 0, if the 
bond to it is violated, or in state a, if the bond to it is 
good. 

For Ti,2, the vertex chosen for an update has 2 vio- 
lated bonds and a — 2 = 1 good bond. First, when that 
vertex flips, there is a shift of one variable (fraction 1/n) 
from p2 to pi . The neighboring vertex on the other end 
of each of the violated bonds could be in state 1, 2, or 3 
with probability pi, p2, or pa, respectively, and the ver- 
tex attached via the good bond could be in state 0, 1, 
or 2 with probability po, pi, or p2, respectively. Consid- 
ering all allowed combinations, we can find the relative 
(unnormalized) influx into any of the pi as a consequence 
of updating the vertex at the center. The sum of the in- 
fluxes should equal the fraction of moved vertices, a/n, 
and the relative influxes can be normalized accordingly. 
Finally, one can identify for each of the combinations 
where that fraction of moved vertices originated from, 
which leads to negative out-flux to the Ti.2 [which is ob- 
viously required to satisfy Eq. (||)]. The out-flux out of 
state i must be proportional to pi. Thus, we obtain the 
following three terms contributing to Ti^2- 

y _ 1_ ^ PO Pl + 2 pi P2 + PO P3 + 2 /92^ + 3 po P2 

n n- (1 - po) (1 - Ps) 

(3pi-H3p2 + P3 + 2po)pi ^^2^ 
n(l - po) (1 - ps) 
and the construction of the other elements of T in this 
annealed approximation proceeds equivalcntly. 

It is not too hard to obtain some steady state (p — 0) 
results for Eqs. (|^) with this particular T, supplemented 
by Eq. (]lO|). One example would be the random walk 
limit, Qi = Pi, equivalent to r = 0. More revealing for 
the analysis of EO is the r ^ oo limit. In that case, on 
each update only one among the worst spins gets flipped. 
From some random initial conditions, EO would empty 
out state 3 first (Q3 = 1, Q2 = Qi = Qa = 0), than 
empty out 2, and so on, until a steady state is reached 
with the highest non-empty state being pj with some 
j > 0. In this steady state, we can try to determine the 
Pi(oo) with the ansatz Qi = Ci, = 1, where the 

average is taken over time. The only consistent balance 
is obtained with state 3 totally empty, p3(oo) = and 
C3 = 0, and state 2 almost empty except for a single 
spin reaching the state sometimes, i. e. p2(oo) « and 
C2 > 0. Hence, cq = 1 — ci — C2 and pi — 1 — po, which 
leads to a drastically simplified equations: 

= C2(3 + 2po) + ci(2 + Po) - (1 + 3po), 

= C2(l - 4po) - ci(l -f 2po) - (3 - 6po), 

= -C2(3-2po) + cipo + 3(l-po), (13) 



all other equations being redundant. The solution is sim- 
ply 

po(oo) = 1, pi(oo) = 0, 

ci=2> C2 = -, (14) 

consistent with numerical simulation for all initial con- 
ditions. Thus, in the steady state, almost all variables 
are in the ground state except for a single vertex that is 
being bounced between state 1 and 2. 

The result that EO converges to the ground state for 
T = 00, while reassuring, is not very helpful to under- 
stand either EO or the original problem. The annealed 
approximation has eliminated everything that made the 
problem interesting, and EO's convergence for r = 00 
to a perfectly optimized ground state clearly does not 
resemble our numerical results from Sec. ||. 

B Models with very simple flows 

Our naive annealed approximation has eliminated 
most of the relevant features of the original, hard prob- 
lem. Not surprisingly, it also fails to predict the existence 
of a finite value for Topt (see Fig. ||); it is easy to convince 
oneself that t = c» is in fact the best case scenario for 
T-EO for all initial conditions and even at finite runtime. 
Yet, two basic features of the evolution equations remain 
appealing: (1) The behavior of a system with a large 
number of variables can be abstracted into a relatively 
simple set of equations, describing their dynamics with 
a small set of unknowns, and (2) the separation of up- 
date process, T, and update preference, Q, lends itself 
to an analytical comparison between different heuristics. 
This distinction is possible, of course, only as long as 
these heuristics can utilize the same single- variable, local 
search process in T. The question is: Can we construct 
interesting processes T in the sense that they capture 
salient features observed for local search on real, NP- 
hard problems? We will show that even the most basic 
versions of T provide some insights into the workings of 
various local search heuristics. 

For simplicity, we choose a as small as possible for 
the three following model situations. Without restric- 
tion of generality, in these cases a = 2 is sufficient, but 
more complicated phenomena could be accommodated 
with more states. First, we consider the most trivial case 
where a variable when updated merely moves from state i 
to state i — 1 for i > 0, or from state to state a (to make 
every state accessible), Ti J = [-^i,j+^i,(a-i-j mod a+i)]/"- 
This process is conveniently depicted as a fiow chart in 
Fig. |a. Clearly, any gradient descent method will be 
able to reach the ground state e = for this process, 
since there are no barriers. For instance, simulated an- 
nealing with zero temperature will reach this state in 
0{n) trials, and r-EO for t = 00 will reach e = in < n 
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steps, when averaged over initial conditions. [Note that 
in the above notation, Ci = 1/4 solves the steady state 
equations where Cq > implies po{oo) — 1, pi>o(oo) — 0.] 



Again, 



'opt 



cxD is obvious. In fact, this model can be 



solved readily for any r with the methods to be developed 
below in Sec. Ill C| . For the random walk limit, r = 0, it 
is Ci = Pi{oo) = 1/4 since Qi = pi. 

Next, we can reverse the directions of transitions in 
the previous example to obtain a less trivial case, which 
now possesses energetic barriers. Here Tij — [—Sij + 
^{a+i mod a+i),j]/'^j BiS dcpictcd in Fig. ||b. Remarkably, 
the previous analysis for t-EO (at least, for r = or 
oo) does not change. The e = steady state is reached 
again in < n steps for r = oo, since EO does not reject 
uphill moves which are required here to arrive at state 
through state 2, and Topt — oo again. On the other 
hand, it is quite clear that simulated annealing will not 
arrive at e = with finite probability in polynomial time, 
even for a sophisticated temperature schedule. Such en- 
ergetic barriers are, of course, an inherent feature of many 
NP-hard problems, which makes this simple model quite 
revealing. 



C Model with jammed flow 

Naturally, the range of phenomena found in a local 
search of NP-hard problems is not limited to energetic 
barriers. After all, so far we have only considered con- 
stant entries for Tij. Therefore, in our next model we 
want to consider the most simple case of T depending 
linearly on the p^'s. Most of these cases reduce to the 
phenomena already discussed in the previous examples. 
A entirely new effect arises in the following case, also 
depicted in Fig. ||c: 



Po = - 

n 



Pi = - 
n 



P2 = - 

n 



^Qo~Qi + {0-pi)Q2 



Pl)Q2 



1 = Po + Pi + P2- 



(15) 



Aside from the dependence of T on pi, we have also 
introduced the threshold parameter 0. In fact, if > 1, 
the model behaves effectively like the previous models, 
and for < there can be no flow from state 2 to the 
lower states at all. The interesting regime is the case 

< < 1, where further flow from state 2 into state 

1 can be blocked for increasing pi, providing a negative 
feed-back to the system. In effect, the model is capable of 
exhibiting a "jam" as observed in many models of glassy 
dynamics ||3^, ^ , and which is certainly an aspect of 
local search processes. 





Flow down 



Flow up 



Flow jam 



FIG. 3: Plot of the flow diagrams for the different mod- 
els discussed in the text. Diagram (a) shows a situation 
in which variables in higher states always evolve toward 
lower states (except for the lowest state flowing up). In 
diagram (b), variables have to jump to higher energetic 
states first before they can attain the lowest state. Dia- 
gram (c) shows the model of a jam, where variables in the 
highest state can only traverse through the intermediate 
state to the lowest state, if the intermediate state moves 
its variables out of the way first to keep its density pi 
below the threshold 6. The states have energies that in- 
crease from the bottom up, the p's mark the occupation 
density of each state, and arrows out of a state indicate 
the rates n Ti^j at which a variable flows from state j into 
another state i, if a variable in state j gets updated. 



We proceed to calculate the unique fixed point of the 
system for EO with arbitrary r. In the general case, the 
Q's depend on the p's in a more complicated way. As 
described in the numerical simulation of the glass on a 
random graph in Sec. ||, each update a spin is selected 
based on its rank according to the probability distribu- 
tion in Eq. (H). When a rank k{< n) has been chosen, a 
spin is randomly picked from state a, if k/n < pa, from 
state a — 1, if pa < k/n < Pa+Pa-ii and so on. We intro- 
duce a new, continuous variable x — k/n, approximate 
sums by integrals, and rewrite P{k) in Eq. (0) as 



p{x) = 



,r-l 



1 



- < a; < 1 
n 



(16) 



where the maintenance of the \ow-x cut-off at 1/n will 
turn out to be crucial. Now, the average likelihood that 
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a spin in a given state is updated is given by 

1 



Qa — 



ia-1 



p{x)dx 

1/n 



1-n 
p{x)dx 



1 



1 - n^-i 
1 



(17) 



p{x)dx 



i-po 



1 - 



where in the last hne the norm pi — 1 was used. 
These values of the Q's completely describe the update 
preferences for r-EO at arbitrary t. 



Inserting the set of Eqs. (18) for a = 2 into the model 
in Eqs. (p^, we obtain 



1 



Pa = 



Pi 



n{l -n^-i) 
1 

n (1 - n^-i) 



n (1 - n^-i) 



2 



1 = PO + Pi + P2- 



P2 



(^? - pi) {pI-^ n^-') 



(18) 



We abbreviate A — {1 — po)^ and B = p\ to obtain 
for the steady state, p = 0: 







(0-pi)(S-n^-i) 
pi)(S-n--i), 



2 2 



(19) 



One of the first three equations is redundant, and we 
obtain 



= -(A - 1) + [6* - ^1/(1-^) + (3A - 2)1/(1-^) 



where 



(3A-2-n^-i) , 
P2 = (3A- 2)1/(1-), 

Pl = 1 - PO - P2- 



(20) 



(21) 



The implicit Eq. (|2C|) has some remarkable properties. 
It has a single physical solution for the p's for all < 



r < oo, < < 1 '35 1, and all n. In particular, in the 
thermodynamic limit n — > cxd a critical point at r = 1 
emerges. If r < 1, the n-dependent term in Eq. ( [20| ) 
vanishes, allowing and hence the p's, to take on finite 
values, i. e. e > 0. If r > 1, the rt-dependent term 
diverges, forcing A to diverge in kind, resulting in po — 1 
and Pi ^ for i > 0, i. e. e 0. This behavior of eir) 
for various n is shown in Fig. ^. 

Having a unique fixed point solution seems to be the 
last word on this problem, with r = cxi again being the 
most favorable value at which the minimal energy e = 
is reached for sure. But it can be shown that the system 
has an ever harder time to reach that point, requiring 
typically t = Oin^) update steps for a finite set of initial 
conditions. Thus, for a given finite computational time 
^max the best results are obtained at some finite value of 
Topt- In that, this model provides a new feature — slow 
variables impeding the dynamics of faster ones |Q — 
resembling the observed behavior for EO on real prob- 
lems, e. g. the effect shown in Fig. ^. In particular, this 
model provides an analytically tractable picture for the 
relation between the value of Topt and the effective loss 
of ergodicity in the search conjectured in Refs. jlj, |2^. 

The generic evolution of the jamming model for r > 1 
is as follows: If the initial conditions place a fraction po > 
\ — Q already into the lowest state, most likely no jam will 
emerge, since pi(i) < Q for all times, and the ground state 
is reached in < n steps. But if initially pi -l-p2 > Q^ and r 
is sufficiently larger than unity, EO will drive the system 
to a situation where pi « by transferring variables from 
P2 to pi. Then, further evolution becomes extremely 
slow, delayed by the r-dependent, small probability that 
a variable in state 1 is updated ahead of all variables in 
state 2. The typical situation in that case is depicted in 
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FIG. 4: Plot of e = ipi/2 as a function of r resulting 
from the solution of Eqs. ( |2C| , |2l] ) for 6* = 1/2 and various 
values of n. For n oo, a sharp transition emerges at 
T = 1, giving optimal results e — > for all t > 1. But for 
T > 1 this steady state is reached only for suitable initial 
conditions, or after sufficient time, see Fig. ^. 
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Fig. |. 

Clearly, the jam is not a steady state solution of the 
evolution equations in (^. [It is not even a meta-stable 
solution since there are no energetic barriers. For in- 
stance, simulated annealing at zero temperature would 
easily find the solution in t = 0{n) without experienc- 
ing a jam. In reality, a hard problem would most cer- 
tainly contain combinations of jams, barriers, and possi- 
bly other features.] But Fig. || suggest the right asymp- 
totic approach to evaluate the long-time behavior of the 
jam: Consider that we start with initial conditions lead- 
ing to a jam, pi(0) -I- P2{0) > d. We can assume that 



p^{t) = e-e{t) 



(22) 



with e <C 1 for t < tjam, where tjam is the time at which 



P2 gets small. To determine ijam, we apply Eq. (|2^) to 
the evolution equations in (|l^) to get 



Po 

TV 



1 

2 " 



1 1-r 
2^2 



P2- 



1-T 

P2 



en 



(23) 



Here, we have already dropped one of the equations (for 
P2) which was redundant. Now, we also can disregard 
the equation containing e, its importance being that it 
determines the first-order correction, e = 0{n}~'^), con- 
sistently as a function of the leading order contributions 




50000 100000 150000 

t 

FIG. 5: Plot of the typical evolution of the system in 
Eqs. (18) for some generic initial condition that leads 
to a jam. Shown are po(t), pi(t), and P2it) for n — 
1000, T = 2, 6* = 0.5 and initial conditions po{0) ^ 0.2, 
pi(0) = 0.35, and p2{0) = 0.45. Since pi(0) < 6, pi fills 
up to 9 almost instantly with variables from p2 while po 
stays constant. After that, pi ~ 9 for a very long time 
(3> n) while variables slowly move down through state 
1. Eventually, after t — 0{n'^), p2 vanishes and EO can 
empty out pi directly which leads to the ground state 
Po = 1 (e = 0) almost instantly. 



of po(0 ^^'^ P2{t). Using the last (norm) equation and it's 



derivative to leading-order, po 
an equation solely for p2{t), 



-p2, we finally obtain 



P2 



1 



P2) 



-A 



(24) 



or, using the fact that p2 almost instantly takes on the 
value of pi(0) -f p2(0) - 61 = 1 - 61 - po(0) (see Fig. |), 



t - 71^ 



i-e-po(o) 



2d^ 



P2(t) 



2 - ?,{9 + iY'^ + e 



(25) 



We can estimate the duration of the jam, ijam, by setting 
P2(tjam) « 0, see Fig. |: 



nVr(l 



Po(0)), 



where we defined 



Ux) 



2di 



(26) 



(x > 0). (27) 



Thus, the duration of the jam scales with n'^ times a 
constant that depends on r, 9, and the initial condi- 
tions. As stated before, if the initial conditions keep 
pi(0) + P2(0) < 9, most likely there will be no jam, re- 
fiected in the fact that /r(a;) goes to zero for x ^ 0. The 
asymptotic scaling in Eq. ( p6| ) conforms well with our 
numerical simulations: with /2(0.3) « 0.16 and n = 1000 
we obtain i^ax ~ 1-6 10^, in good agreement with Fig. ||. 

The long-lived jams that occur for t > 1 will have a 
significant effect on the outcome of a local search with 
EO, which proceeds merely with a finite runtime imax- 
For instance, for t,nax = 0{n) there are always some ini- 
tial conditions for which the jam can not be resolved 
before imax, resulting in e > 0. Thus, the t-EO imple- 
mentation faces two confiicting priorities: On one side, 
larger r increase the quality of the steady-state result for 
e, away from the random- walk- like behavior at r < 1, 
see Fig. ^. On the other side, r > 1 increases the chance 
to get locked into a jam and never to reach that steady 
state in finite runtime, see Fig. |^. In between these con- 
flicting interests, we find a preferred value for Topt that 
averts both, the jam and the random walk, such that (e), 
averaged over initial conditions, is minimized. 

Let us assume we fix the runtime to be tmax — an^ 
where a is a constant with a ^ n, so that n < ^max ^ n'^ 
for r > 1. If we had chosen r < 1 for our implementation, 
there are no jams but we are sure to obtain less than 
optimal results for (e) as in Fig. ^, so we will assume 
r > 1. In this case, we have to distinguish between three 
possible outcomes to a single run of the EO algorithm, 
depending on the initial conditions: (1) If pi(0)+p2(0) < 
9, the run will most certainly reach the optimal state, 
e = 0, within t^ax updates, (2) even if pi(0) + P2(0) > 9 
but tmax ^ ^jam from Eq. (|26|), e = may be reached. 
Only if (3) pi(0)+p2(0) > 9 andtmax ^jam are satisfied, 
the search will get stuck in a state of e > 0, with a value 
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that depends on the initial conditions. Averaging over 
all initial conditions, we find 



(e) 



dpo dpi dp2 (5 (1 - po - Pi - P2) 



1 

A^7o " ^' '^^ '''' 2 

u{l-9-p^) u{U{l-e- pQ)n^ -t 




where u{x) is the Heaviside step- function and 5{x) is the 
Dirac delta-function. The norm is given by 



N ■ 



/ dpo dpi dp2 S {1 - po - pi - P2) 
Jo 



(29) 



Hence, we obtain 



max{o,l-e-/^Htmax/n^)} 



dpo (l-po)'(30) 



The average energy (e) in Eq. ( pO[ ) will start to rise 
for increasing r as soon as the upper integration limit 
becomes non-zero, or when fmax ~ /t(1 — 0)r7,'^. If 
imax ^ /t(1 — 9)n'^, i. e. for r 1, Eq. ( pO|) predicts for 
the average energy (e) = (1 — 0'^)/2. 

Since (e) will reach its minimum value right before the 
onset of jams cause its rise, we can use this relation to 
estimate the optimal value of r. In effect, this justifies 
the connection between Topt and the "edge of ergodicity" 
noted in Ref. [Q. Since the dependence of fr on r is 
much weaker than the exponential n'^ , we can write 



'opt 



In (W//r(l - g)) 
In 71 ' 
\nia/Ml-e)) 



1 



In n 



(31) 



where we have used our choice tmax = an. In recognition 
of the fact that t —^ for n 00, we can simplify the 
last expression by expanding fr (x) in that limit to get 



T - 1 



d^ 



In 



(r ^ 1). (32) 



Note that the pole at r = 1 is a generic consequence of 
Eq. (H), independent of the choice of the particular Ti,j 
depicted in Fig. ||c. If we insert Eq. ( ^2[ ) into Eq. ( [31^ , 
we exactly reproduce the n dependence given in Ref. P5| , 
see Eq. (^ above. Numerical, we get at 9 — 1/2 for 
/^(l/2) « (21n2-l)/(T-l), and using a = 100 and n = 
10, 100, 1000, and 10 000, Eq. (|3|) predicts Topt « 3.5, 
2.1, 1.6, and 1.4. 

We can compare this prediction with numerical simula- 
tions of T-EO applied directly to the jamming system de- 
scribed in Fig. ||c [not just the evolution equation in ( [T^ ) 
that use averaged probabilities Q]. In Fig. ^ we show the 
results for (e) as a function of r for n = 10, 100, 1000, 
and 10000 at tmax — lOOn. Initially, for t < 1, (e) reaches 
the steady state result from Fig. |^ for any initial condi- 
tion. But, as predicted, (e) reaches a minimum at a Topt 




FIG. 6: Plot of the energy (e) averaged over many r- 
EO runs with different initial conditions as a function 
of r for n = 10, 100, 1000, and 10000 and 6* = 1/2. 
For small values of r, (e) closely follows the steady state 
solutions plotted in Fig. ^. It reaches a minimum at a 
value near the prediction for Topt ~ 3.5, 2.1, 1.6, and 1.4, 
and rises sharply beyond that. It reaches an asymptotic 
value approaching the prediction of (e) « 0.44 for r —> 

QO. 



beyond which it starts to rapidly deviate from the steady 
state solution. This is the "ergodic edge" beyond which 
unresolved jams effect the observed value of (e). Our pre- 
diction for Topt appears to become increasingly accurate 
for n — > 00. Furthermore, for r — > 00, Eq. (|30| ) predicts 
(e) 7/16 « 0.44 for 6 — 1/2, in reasonable agreement 
with the numerical value seen in Fig. ^. 



IV. CONCLUSION 

We have presented a simple model to analyze the prop- 
erties of local search heuristics. This model was applied 
to extremal optimization and we found conditions un- 
der which EO exhibited the same phenomenology on the 
model as it does on real combinatorial optimization prob- 
lems as exemplified here by a frustrated spin system on 
a random graph. The analytical results from the model 
in Eq. (pT ) closely resemble Eq. (||), the prediction from 
Rcfs. g||. 

Of course, the model is tailored more toward under- 
standing the EO mechanism and does not nearly repre- 
sent all of the features of a hard optimization problem. 
[After all, it takes EO only 0{n'^) updates to find the 
ground state in the worst case for the model.] Thus, 
finding a non-trivial value for Topt in the model merely 
provides an analogy. For instance, Topt is somewhat de- 
pendent on the relationship of tmax to n. If imax = 0{'n}) 



then 



■opt 



1+ for 



00 according to the model. In 



this regard the analogy seems to hold in every respect for 
the graph bipartitioning problem in Refs. h% ^ where 
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But in our numerical simulations for spin glass 



systems in Sec. || displayed in Fig. |2[ or in Ref. ||5||, typi- 
cally 0{n^~"^) updates were required to obtain consistent 
results for increasing n, yet Topt — > 1^ was found irre- 
spective. 

We believe that our observation for the behavior of EO 
is quite robust under variation of the entries for T. More 
complicated choices for T (which may be analytically 
less tractable) could be made to represent hard prob- 
lems more closely. In this sense, the separation between 
T and Q allows to study comparisons between different 
update modes, and even with other local search proce- 
dures. As our examples in Figs, ^-c show, simple choices 
of T can lead to interesting scenarios, although there is 
no real frustration. For instance, one could analyze the 
properties of EO for different choices of the probability 
distribution over the ranks in Eqs. (|5|Jl^). 

It is more difficult to construct the Q's for simulated 
annealing. Let's assume that we consider a variable in 
state j for an update. Certainly, Qj would be propor- 
tional to pj, since variables are randomly selected for an 
update. The Boltzmann factor e~^^^^ for the potential 
update of a variable in j, aside from the inverse temper- 
ature only depends on the entries for Ti,j: 



AE, = nAe, = - 



3 



(33) 



where the subscript j expresses the assumption that a 
variable in state j is considered for an update. Hence, we 
find for the average probability of an update of a variable 
in state j 



Qj oc Pj min < 1 , exp 



(34) 



which is still short of a proper normalization. Similarly, 
comparisons with other methods such as threshold an- 
nealing 1^ can be considered. 
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